Abstract

This paper describes work in progress on the use of visual scanning
behavior as an indicator of pilot workload. The study is investigating the
relationship between the level of performance on a constant piloting task
under simulated IFR conditions, the skill of the pilot, the level of mental
workload induced by an additional verbal task imposed on the basic control
task, and visual scanning behavior.


The results indicate an increase in fixation dwell times, especially
on the primary instrument, with increased mental loading. Skilled subjects
"stared" less under increased loading than did novice pilots. Sequences of
instrument fixations were also examined. The percentage occurrence of the
subject's most used sequences decreased with increased task difficulty for
novice subjects but not for highly skilled subjects.

Entropy rate (bits/sec) of the sequence of fixations was also used to
quantify the scan pattern. It consistently decreased for most subjects as
the four loading levels used increased. An exponential equation in task
difficulty was found to be a good predictor of entropy rate. When solved
for task difficulty, the equation provided an estimate of the level of task
difficulty perceived by a subject.

Piloting and number task performance measures were recorded and a 
combined performance measure was computed. Skill was estimated 
independently via a method based on pilot experience. These measures were 
combined with entropy rate to develop a model relating performance, skill, 
and mental workload. The exponential model fit the data well enough to 
suggest that this approach has promise in the evaluation of interactions 
among these variables. 

Introduction 

The quantification of mental workload in aircraft pilots has been of
considerable interest for some time. Perhaps the chief reason for
measuring workload is to predict conditions under which task performance
will degrade. If such conditions could be accurately predicted, then the
nature and temporal sequence of flight procedures and of pilot/aircraft
interfaces might be arranged so as to minimize the chances of overload.
Quantitative analyses of workload remain elusive, however. What one would
like is a clear cause and effect relationship between an independent
variation in imposed workload and some reliable dependent measure.
The task of flying an aircraft is complex, however, and it has been
difficult to clarify the functional relationships between various
parameters in piloting tasks. The skill a particular individual brings to
the piloting task and the nature of the task which is performed can both be
expected to influence the "difficulty" of the task. These factors may be
further complicated by a shift in the pilot's priorities (some tasks may
be ignored while others receive full attention).


[Figure 1 plot. Axis labels: PERFORMANCE, SKILL, WORKLOAD]


Figure 1. INTUITIVE RELATIONSHIPS BETWEEN
PERFORMANCE, SKILL, & WORKLOAD


The problems which such inter-relationships introduce are well
illustrated when one attempts to employ task performance as an indicator of
workload. All pilots, regardless of skill, can be expected to exhibit poor
performance if the loading level is excessive. The overload situation is
relatively easy to assess, however, using subjective techniques.
Situations which involve intermediate to high levels of loading would seem
to be the ones of more practical concern; i.e., one is concerned with
minimizing the chance of a high workload approaching an overload situation.
Intuition suggests that the level of skill of the pilot may influence the
performance vs. workload relationship for intermediate or marginal loading
levels. A pilot of high skill would be expected to maintain "better"
performance than a novice flyer under any loading condition short of
overload. This intuitive concept is illustrated graphically in Figure 1.

The research described here uses this graphical representation of the
performance/skill/workload relationships in order to pose a number of
testable hypotheses. It will be suggested shortly that instrument scan may
be an indicator of workload and/or skill in certain types of flight
situations, a suggestion supported by both qualitative and quantitative
results. In addition, if a measure of workload based on instrument scan is
combined with independent measures of pilot skill and performance, then a
model of the hypothetical relationships in Figure 1 may be developed and
tested.

Visual Scanning Behavior 

The pilot has many sources of information input, but the most important
one during instrument flight is probably the visual pathway. Under
instrument flight conditions, some sensory inputs may even provide false
information, as in vertigo, which results from conflicting visual and
vestibular information. The pilot obtains information concerning aircraft
state by cross-checking or scanning the flight instruments. The exact
method of scanning the instrument panel varies from pilot to pilot, but
there are some basic features common to a "good" scan pattern. Indeed, it
was the early study by Fitts and his associates on instrument transitions
which led to the familiar "T" arrangement of the major flight instruments
(Jones, et al., 1946).

A fundamental notion in the present work is that a repetitive piloting
task will invoke a regular visual scan (spatial/temporal pattern of eye
movements) during instrument flight. If this notion is correct, then it
may be postulated that external factors such as noise, interruptions, and
fatigue which interfere with the piloting task may produce measurable
changes in the scanning behavior. Such a measure would be particularly
attractive for quantifying workload since it would be both non-invasive and
objective.

Experimental Design 

A series of experiments is being carried out in order to carefully examine
these ideas. The basic experiment is described in detail elsewhere (Tole,
et al., 1982) and only the salient points are repeated here. The
experiments described were performed at the NASA/Langley Research Center,
Flight Management Branch, in Hampton, Virginia, making use of their flight
simulator and oculometer facilities (Middleton, et al., 1977).

Three factors were manipulated in the experiments: 1) a piloting task
requiring a stereotyped scan path, 2) a verbally presented mental loading
task, and 3) a workload calibration side task.

We sought a representative constant piloting maneuver which might be
realistically expected to occur for periods of up to 10 minutes in actual
flight. This run length was chosen as an estimate of the minimum amount of
time required to provide a sufficient number of instrument fixations to
satisfy the assumption of steady state conditions. The Instrument Landing
System (ILS) approach is often chosen as the piloting task in studies of
workload (Waller, 197S; Krebs and Wingert, 1978; Spady, 1977). However,
the ILS approach represents a constantly changing task difficulty as
touchdown is approached (especially due to increases in glide slope
sensitivity and cost of error for course deviation). This variation in the
primary task loading makes it difficult to accurately control the amount of
mental workload on the pilot as an independent variable. It was decided
that a scenario in which glide slope sensitivity and heading were held
constant would allow the piloting task difficulty to remain relatively
constant for a long period, but nevertheless be more or less realistic.

A desktop general aviation instrument flight simulator (Analog
Training Computers ATC-510) was used to simulate these flight maneuvers.
The ATC-510 is a procedures trainer for light, single engine, fixed pitch
prop, fixed gear, IFR equipped aircraft. The simulator was equipped with a
turbulence level control which was set to the first level above calm
conditions in order to force some pilot vigilance on the flight task.

Pilot lookpoint on seven instruments (Attitude Indicator 'ATT',
Directional Gyro 'DG', Altimeter 'ALT', Vertical Speed Indicator 'VSI',
Airspeed 'AS', Turn and Bank 'T&B', and Glide Slope/Localizer 'GSL') was
measured using a Honeywell oculometer system which has been substantially
modified by NASA Langley Research Center (Middleton, et al., 1977). This
device is non-invasive and allows the user to determine the time course of
eye fixations on instruments employed by the pilot and the dwell time of
each fixation to the nearest 1/30 sec.

The mental loading task was chosen so as not to directly interfere
with the visual scanning of the pilot (i.e., the task would not require the
pilot to look away from the instruments) while providing constant loading
during the maneuver. The task used required the pilots to respond to a
series of evenly spaced three-number sequences (Wittenborn, 1943) presented
to them audibly by means of a speaker. The pilot was told that he must
respond to each three-number sequence by indicating either "plus" or
"minus" according to the algorithm: first number largest, second number
smallest = "plus" (e.g. 5-2-4); last number largest, first number smallest
= "plus" (e.g. 1-2-3); otherwise, "minus" (e.g. 9-5-1).
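The response rule above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_triple` is hypothetical, and the use of strict comparisons (so that ties fall through to "minus") is an assumption, since the text does not say how repeated digits were handled.

```python
def classify_triple(a, b, c):
    """Return "plus" or "minus" for one three-number sequence.

    Strict comparisons are an assumption; the source does not
    describe how sequences with repeated numbers were treated.
    """
    first_largest = a > b and a > c
    second_smallest = b < a and b < c
    last_largest = c > a and c > b
    first_smallest = a < b and a < c
    if (first_largest and second_smallest) or (last_largest and first_smallest):
        return "plus"
    return "minus"


# The three examples quoted in the text.
for triple in [(5, 2, 4), (1, 2, 3), (9, 5, 1)]:
    print(triple, classify_triple(*triple))
```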

The mental workload experienced by the pilot is inversely proportional
to the intervals between number sequences. This relationship is given by
the following equation, which is arbitrarily chosen:

(1)   $TD = 1 / (\text{interval between tasks})$

where TD is equal to imposed task difficulty. The four loading levels used
in the current experiments were intervals of continuous silence (i.e.,
no numbers presented), ten, five, and two seconds, which have corresponding
task difficulties of 0.0, 0.1, 0.2, and 0.5, respectively.
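As a quick check of the four loading levels, the reciprocal relationship can be evaluated directly. The sketch below is illustrative only; representing the silent condition as `None` and mapping it to a task difficulty of 0.0 is a convention chosen here, not something specified in the text.

```python
def task_difficulty(interval_sec):
    """Imposed task difficulty, TD = 1 / interval between number sequences.

    None stands for continuous silence (no numbers presented) and is
    mapped to 0.0; this encoding is an assumption of the sketch.
    """
    if interval_sec is None:
        return 0.0
    return 1.0 / interval_sec


levels = [None, 10, 5, 2]  # silence, then 10-, 5-, and 2-second intervals
print([task_difficulty(s) for s in levels])
```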

Numbers were generated by a computer controlled speech synthesizer.
This allowed automated scoring of task accuracy, calculation of response
reaction times, and the possibility of temporal correlations of visual or
other responses with the verbal stimulus. The probabilities of occurrence
of "+" and "-" sequences were each 0.5. The pilot was instructed to give
the number task priority equal to that of the piloting task, as if the
verbal questions represented a constant rate of radio communication.
Performance was recorded by having the pilot press a 3-position rocker
switch mounted on the yoke, up for plus and down for minus.

The amount of mental loading imposed on the pilot by the number task
was calibrated using a side task (Ephrath, 1975). The runs made with the
side task were not used in the scanning analysis, however, due to the
alteration of normal scanning caused by the task. The results (Tole,
et al., 1982) from these runs confirmed the relative difficulty of the
various number intervals.

A microprocessor development system (Burns, et al., 1980) was used for
both stimulus presentation and data collection and analyses.

Performance Measures 

Several variables were obtained from each of the two tasks in order to
allow the computation of performance scores. The scores developed lie
between 0 percent and 100 percent, with 100 percent being obtained if the
pilot never deviated from the intended path in space on the piloting task,
and if all number task sequences were answered correctly for the mental
loading number task. The scores from the piloting and the mental loading
tasks were then combined to provide a performance measure to be used in the
validation of the proposed performance/skill/workload model.

The scoring measure for the number task was computed as given below.

(2)   $\#TP = \dfrac{TOT - WRO - MIS}{TOT} \times 100\%$

where

#TP = mental loading number task performance
TOT = total number of stimuli presented
WRO = number of incorrect responses
MIS = number of missed responses

This score was 100 percent if the pilot answered every sequence correctly
and zero percent if a pilot either answered incorrectly or missed all of
the stimuli presented. Most subjects score nearly 100% on this task if
they have nothing else to do simultaneously.
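The number task score is simple arithmetic on three counts. A minimal sketch follows; the function and argument names are hypothetical, and the example counts are made up purely to exercise the formula.

```python
def number_task_score(total, incorrect, missed):
    """Percent score: (total - incorrect - missed) / total * 100."""
    if total <= 0:
        raise ValueError("at least one stimulus must have been presented")
    return 100.0 * (total - incorrect - missed) / total


# Illustrative counts only (not data from the experiments).
print(number_task_score(60, 0, 0))   # every sequence answered correctly
print(number_task_score(60, 3, 3))   # three wrong and three missed
print(number_task_score(60, 40, 20)) # every stimulus wrong or missed
```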

The raw data available for scoring performance on the piloting task
were the errors from the intended track for the glide slope and localizer
courses. Discussions with several highly skilled pilots revealed that
accuracy of tracking the glide slope and localizer might not provide a
complete performance picture. These pilots were willing to trade off
"smoothness" when the loading task became more difficult; i.e., the pilot
may perform the piloting task to the same level of accuracy, as far as
deviations from a designated path are concerned, on two different runs but
produce two very different ride qualities for these runs. One possible
measure for smoothness could be the frequency of oscillation around the
intended path. The higher this frequency is, the less "smooth" the ride
becomes. It was arbitrarily assumed that a smooth ride would contain
frequencies mostly less than 0.1 Hz. Under this assumption, measurement of
the spectral component of the aircraft dynamics above 0.1 Hz would
indicate any decrement in the ride quality.

In order to examine this measure, the power-spectral density (PSD) of
the course deviations was computed. The bandwidth of the calculated PSD
was 2.5 Hz. The "power" within a band of frequencies may be determined by
integrating the PSD over that band (Schwartz, 1959). We chose to consider
the percentage of the spectral power which was located in the band from 0.1
to 2.5 Hz. This was calculated by subtracting the power contained in the
band from 0 to 0.1 Hz (assuming that the D.C. component was first removed)
from the total power in the spectrum, expressing the difference as a
fraction of the total power, and multiplying by 100%. This percentage of the
PSD was computed for both the glide slope and the localizer and combined
with the two RMS measures to provide four candidate variables to be included
in a performance score for the piloting task.
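One way to picture the percent-power measure is with a plain discrete Fourier transform of the deviation record. The sketch below is not the authors' PSD procedure, which the text does not detail; it assumes a 5 Hz sampling rate (suggested by, but not stated with, the 2.5 Hz bandwidth), and the function name `percent_power_above` is hypothetical.

```python
import cmath
import math


def percent_power_above(samples, sample_rate_hz, cutoff_hz=0.1):
    """Percent of D.C.-removed spectral power at frequencies above cutoff_hz."""
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]  # remove the D.C. component first
    total = 0.0
    high = 0.0
    for k in range(1, n // 2 + 1):  # positive-frequency DFT bins
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        # Every bin except Nyquist represents a +/- frequency pair.
        weight = 1.0 if (n % 2 == 0 and k == n // 2) else 2.0
        power = weight * abs(coeff) ** 2
        total += power
        if k * sample_rate_hz / n > cutoff_hz:
            high += power
    return 100.0 * high / total if total > 0.0 else 0.0


# Synthetic deviation traces: a slow 0.05 Hz drift and a faster 0.5 Hz wobble.
fs, n = 5.0, 400
slow = [math.sin(2 * math.pi * 0.05 * t / fs) for t in range(n)]
fast = [math.sin(2 * math.pi * 0.5 * t / fs) for t in range(n)]
print(round(percent_power_above(slow, fs), 3))
print(round(percent_power_above(fast, fs), 3))
```

Under the smoothness assumption in the text, a trace dominated by the slow component would score near 0 percent and one dominated by the faster component near 100 percent.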

Since the pilots were instructed to give equal priority to the
piloting task and the mental loading number task, both were included in the
development of a combined performance score. While a weighting of 0.5
might have been assigned to each task, it was decided to leave the
weighting free to allow the model fitting procedure to determine the
relative weights. A linear relationship between all of the terms was
assumed and the form of the equation became:

(3)   $P = CONST + a(\#TP) + b(RMS/GS) + c(RMS/LOC) + d(\%PWR/GS) + e(\%PWR/LOC)$

where

P        = combined performance measure
CONST    = constant term
#TP      = mental loading number task performance
RMS/GS   = RMS error from glide slope track
RMS/LOC  = RMS error from localizer track
%PWR/GS  = percent of power from the power-spectral density for
           the glide slope greater than 0.1 Hertz
%PWR/LOC = percent of power from the power-spectral density for
           the localizer greater than 0.1 Hertz

Estimation of Pilot Skill levels 

In order to assess the effects of skill on performance and mental
workload, an independent quantitative measure of skill was needed. A model
of pilot skill based on experience factors was used for this purpose
(Hollister, et al., 1973). This model was developed in order to predict the
current level of skill of pilots flying light, single engine aircraft.

(4)   Skill = 1.42 + 0.25(recency) + 0.73(log(total time))

              0.030(years certified) + 0.15(log(time in type))

              - O.OOaS(age) + e

where

Skill           = score reflecting relative piloting performance
recency         = number of flight hours in past 30 days
total time      = total number of flight hours
time in type    = total number of hours in light single engine aircraft
years certified = time in years since last certificate or rating
age             = subject's age in years
e               = residual variance not explained by the model





A raw skill score was calculated for each of the pilot subjects using
the model. The pilot with the highest resulting skill score was then used
to normalize all of the scores so that skill levels would range between 0%
and 100%. Eleven subjects ranging in skill from NASA test pilots to
non-pilots participated in the experiments. The relative skill scores for
the subjects are given in Table I.

NASA PILOT #    SKILL SCORE

      3            100%
      4             35
     11             77
     13             53
     15             33
      8             37
     12             33
     14             32
      3             22
      7             15
     16             13


TABLE I.

Relative Skill Scores of Subjects based on Equation 4


Though care must be taken when applying an equation such as this in a
different set of experimental conditions, the overall rank ordering of the
pilots by this method is probably accurate, as it generally agreed with
subjective ratings of the pilots' skills by experienced observers at the
NASA/Langley Research Center.

Conduct of the Experiments 

Each session consisted of four 10-minute runs with a 5-minute break
between each run. The difficulty of the mental loading task would start at
no numbers for the first run and increase to 2-sec intervals by the fourth
run. Some subjects participated in two sessions, one without and one with
the side task. Each subject was allowed to practice all three tasks until
he felt comfortable with them.


Preliminary Results 


Instrument dwell time histograms and the frequency of usage of
different sequences of instrument fixations were both affected by the
loading task. Both results are reported in detail elsewhere (Tole, et al.,
1982) and only the major points are mentioned here. An increase in dwell
time with increase in mental loading was observed in all subjects. This is
illustrated in Figure 2. Novice subjects generally had much longer dwell
times under increased load than did skilled pilots. (Relative skill levels
are given in Table I above.) The fixation sequences of the pilots'
instrument scans were analyzed, and the percentage occurrence of the ten
most frequently occurring sequences was also analyzed. These results
indicate that: 1) skilled pilots use a higher percentage of their ten most
frequently occurring sequences than do novice pilots and 2) the scan
patterns of the novice subjects were affected more by the increase in
mental loading than were the patterns of the highly skilled pilots. This
result is shown in Figure 3.


Figure 2. DWELL TIME HISTOGRAMS FOR TWO SKILLED PILOTS (#4 & #11)
AND TWO NOVICE PILOTS (#9 & #10) UNDER VARIOUS LOADING
CONDITIONS


A more general method of quantifying the scan 

Traditionally, much of the quantitative analysis of scanning patterns
has employed Markov transition probability matrices (Stark and Ellis, 1981;
Krebs and Wingert, 1978). Such matrices do describe the predominant
patterns in the scan via the relative sizes of transition probabilities, but
it is either extremely unwieldy or impossible to compare two of these
matrices for different experimental conditions. One of the major goals of
this research is the identification of general methods for the study of
scanning behavior. To be most useful, the method should be independent of
the number and arrangement of instruments. The nature of
eye-point-of-regard data (sequential instrument and dwell times) obtained
from the oculometer suggests several methods from information theory which
may have this generality.


[Figure 3 plot. Axis label: LOADING TASK]

Figure 3. PERCENT USAGE OF LENGTH 4 SEQUENCES UNDER VARYING
LOAD (TYPICAL SEQ: ATT - DG - ATT - ALT)


The piloting task in the current experiment is such that the pilot's
scan can only lie on one of the 7 specified instruments, although each
fixation may be of arbitrary duration. The time history of fixations has a
form which is similar to that of a communications system which can assume 7
discrete states with a varying duration in each state. The orderliness of
such a system is related to the probabilities with which it occupies its
different states. A system which always occupied the same state or always
made the same transitions between states would thus be quite orderly. In
the case of instrument scan, these situations would be paralleled by
staring and by a stereotyped scanpath, respectively.

This concept of system order may be stated compactly using the
mathematical form for entropy from information theory. The entropy of a
sequence is defined as (Shannon and Weaver, 1949):

(5)   $H_o = -\sum_{i=1}^{D} p_i \log_2 p_i$

where

Ho  = observed average entropy
p_i = probability of sequence i occurring
D   = number of different sequences in the scan


In the case of the instrument scan, entropy has the units of
bits/sequence and provides a measure of the randomness (or orderliness) of
the scanpath. The higher the entropy, the more disorder is present in the
scan. The maximum possible entropy is constrained by the experimental
conditions (see below). The entropy measure uses the same probabilities
which are present in transition matrices, but it yields a single, more
compact expression for the overall behavior of the probabilities rather
than presenting them each individually. This method appears to afford some
generality and has been the focus of our recent efforts.

To implement this method, each of the instruments to be examined was
given a number. Then a sequence of these numbers was stored as the pilot
scanned the instrument panel, together with the dwell time for each
fixation. While sequences of up to length 4 were considered in preliminary
analyses, the most detailed study was made on sequences of length 2. The
remainder of the discussion here applies to the results for length 2
sequences. Details of the methodology are given elsewhere (Stephens, 1981).
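To make the bookkeeping concrete, the sketch below estimates the entropy of a stored fixation record from the relative frequencies of its length-N sequences, using the standard Shannon form. It is a minimal illustration rather than the procedure of the cited methodology: the use of overlapping sequences, the function name `sequence_entropy`, and the toy record are all assumptions.

```python
import math
from collections import Counter


def sequence_entropy(fixations, n=2):
    """Observed entropy (bits/sequence) of overlapping length-n fixation sequences."""
    seqs = [tuple(fixations[i:i + n]) for i in range(len(fixations) - n + 1)]
    if not seqs:
        raise ValueError("record is shorter than the sequence length")
    counts = Counter(seqs)
    total = len(seqs)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy record: a strictly alternating ATT/DG scan uses only two
# distinct length-2 sequences, so it is highly ordered.
alternating = ["ATT", "DG"] * 4 + ["ATT"]
varied = ["ATT", "DG", "ATT", "ALT", "ATT", "VSI", "ATT", "AS", "ATT"]
print(sequence_entropy(alternating))
print(sequence_entropy(varied))
```

A more varied scan spreads probability over more distinct sequences and therefore yields a larger value than the alternating record.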

It can be shown that the observed entropy for the instrument scan is
related to the total number of fixation sequences (L, defined with equation
7 below) observed during a run. In order to compare entropies from the
scans of different pilots for different run lengths, each estimate of
entropy had to be corrected for L and normalized to its maximum possible
value, Hmax. Hmax may be calculated as follows. In the most general case,
M instruments may be arranged in some arbitrary fashion on the cockpit
panel. For a given number of instruments, M, and sequence length, N, the
maximum number of different fixation sequences is given by:

(6)   $Q = M(M-1)^{N-1}$ = maximum number of sequences of length N

The number of bits required to uniquely encode all Q possible sequences is
log2 Q. The magnitude of this latter number also represents Hmax of the
visual scan for the number of instruments and sequence length being
considered. For example, with 7 instruments the value of Q for sequences
of 2 instruments is 56, which yields a corresponding Hmax = 5.8.
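The counting argument can be checked numerically. The sketch below evaluates the expression for Q and its base-2 logarithm; the function names are hypothetical. Note that the expression gives 42 for M = 7 and N = 2, while the quoted value of 56 (and Hmax of about 5.8) corresponds to M = 8 in the same expression, so both cases are printed without assuming which count was intended.

```python
import math


def max_sequences(m, n):
    """Q = M * (M - 1)**(N - 1): sequences with no immediate repeat of an instrument."""
    return m * (m - 1) ** (n - 1)


def h_max(m, n):
    """Bits needed to encode all Q possible sequences, log2(Q)."""
    return math.log2(max_sequences(m, n))


for m in (7, 8):
    print(m, max_sequences(m, 2), round(h_max(m, 2), 2))
```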

The normalized value of H may then be calculated from:

(7)   $Hcorr = H_o \times \dfrac{Hmax}{\log_2 L}$

where

L = R - N + 1 = number of sequences in a run
R = number of fixations in a run
N = sequence length (N = 1, 2, 3, or 4)
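A short numerical sketch of this correction follows, under the reading that the observed entropy is scaled by Hmax divided by log2 L. The function name and the example figures are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import math


def corrected_entropy(h_observed, h_max, num_fixations, seq_len):
    """Hcorr = Ho * Hmax / log2(L), with L = R - N + 1 sequences in the run."""
    num_sequences = num_fixations - seq_len + 1
    if num_sequences < 2:
        raise ValueError("need at least two sequences for a finite correction")
    return h_observed * h_max / math.log2(num_sequences)


# Illustrative values: 1025 fixations give L = 1024 length-2 sequences,
# so log2(L) = 10 and an observed entropy of 2.0 bits scales by 5.8 / 10.
print(corrected_entropy(2.0, 5.8, 1025, 2))
```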


While entropy should help to explain the orderliness (or lack thereof)
of the scanning pattern, the development presented up to this point does
not include the fact that the dwell time for each fixation is different.
From the preliminary results on instrument dwells, it appears rather clear
that dwell times can be markedly affected during high mental loading. In
order to include the effect of time in our measure, a term for entropy rate
was defined as:

(8)   $Hrate = H_o / t$

where Ho is the entropy for the system given by 7 and t = smallest interval
in which a transition may occur.

In practice, the calculation of Hrate was an average value given by
the following:

(9)   $Hrate_{avg} = \sum_{i=1}^{D} Hcorr_i / DT_i$

where

Hcorr_i = normalized entropy for ith sequence
DT_i    = average dwell time for ith sequence
D       = number of different fixation sequences

It is helpful to estimate the maximum value which Hrate might assume.
This may be calculated using the maximum for entropy determined above
together with dwell time statistics for the various instrument sequences in
the scan. While it is possible for pilots to make rather rapid glances
(with dwell times of 100 msec or less) at their instruments (Harris and
Christhilf, 1980), a fixation rate this high (10 fixations/sec) rapidly
leads to oculomotor fatigue. A more realistic average value is probably
about 2 fixations/sec or less for a long period of instrument scan (say >
10 sec).

Using 0.5 sec/look (2 fixations/sec) as the average dwell interval,
the maximum entropy rate for sequences of length 2 is calculated to be

$Hrate_{max} = 5.8 / (0.5 \times 2\ \text{fixations/seq}) \approx 6\ \text{bits/sec}$

This number represents an upper bound. Since we suspect that the pilot
must have some regularity in his or her scan, the numbers we would expect
to obtain under actual flight conditions will probably be lower. The
observed average Hrate for the current experiments was on the order of 1
bit/sec. A tendency to stare under increased load should be reflected by
decreased entropy and increased fixation times, making Hrate tend toward
lower values under such conditions. Figure 4 plots Hrate vs. number task
difficulty for all pilots except 12 and 8.



Figure 4. ENTROPY RATE ON LENGTH 2 SEQUENCES vs. 
IMPOSED TASK DIFFICULTY 


A trend toward lower entropy rate with higher task difficulty may be seen.

A two-way analysis of variance was performed for the entropy rate data from
nine pilots on levels of task difficulty and between subjects. F tests
allowed rejection of two null hypotheses: equality of mean Hrate at all
loading levels (p < 0.01) and equality of mean Hrate between subjects (p <
0.01). All six combinations of level differences in mean Hrate were found
to be statistically significant (t-test, p < 0.05). Thus Hrate was chosen
to map from scanning behavior into task difficulty (i.e., workload).

The model used expresses Hrate as an exponential function of TD.

(10)   $Hrate = 0.9279 \exp(-TD)$

This equation was obtained via a regression analysis based on the data from
seven of the pilots, with a coefficient of determination, R-squared, of
97.3%. This equation may be solved for task difficulty with the following
result:

(11)   $TD = -(0.06 + \ln Hrate)$

This expression can then be used to predict the level of TD for a new
subject under the conditions of the experiment reported here.
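The mapping from entropy rate back to perceived task difficulty can be sketched by inverting equation 10 directly. The code below uses only the coefficient 0.9279 from equation 10; note that ln 0.9279 is about -0.075, so a direct inversion gives a constant slightly different from the 0.06 printed in equation 11. The function names are hypothetical.

```python
import math

A = 0.9279  # coefficient of the fitted exponential in equation 10


def hrate_from_td(td):
    """Equation 10: Hrate = A * exp(-TD)."""
    return A * math.exp(-td)


def td_from_hrate(hrate):
    """Direct inversion of equation 10: TD = ln(A) - ln(Hrate)."""
    if hrate <= 0.0:
        raise ValueError("entropy rate must be positive")
    return math.log(A) - math.log(hrate)


# Round trip over the four imposed task difficulties.
for td in (0.0, 0.1, 0.2, 0.5):
    h = hrate_from_td(td)
    print(td, round(h, 4), round(td_from_hrate(h), 4))
```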

Model Development and Verification 

One of the major goals of this work was the development of a model
relating performance, skill, and mental workload. The ultimate goal is the
prediction of performance given estimates for skill and scanning
parameters. A model relating performance, skill, and mental workload may
be postulated from the empirical relationship shown in Figure 1.
Construction of the model should, in fact, aid in determining whether such
empirical expressions are valid. The model chosen was an exponential form:

(12)   $P = P(0) - \exp\left((TD/Skill)^2\right)$

This equation may be rearranged as follows:

(13)   $\exp\left((TD/Skill)^2\right) = P(0) - P$

which states that the exponential term is equal to the difference between
the performance at the no-loading level, P(0), and the performance at the
present level of mental loading, P. Using the values for the level of skill
and task difficulty calculated in equations 4 and 11 respectively, the left
hand side of the equation may be computed. The right hand side of the
equation must be expressed in terms of measurable performance indicators.

Expanding the right side of (13) yields

(14)   $P(0) - P = a(\#TP(0) - \#TP) + b(RMS/GS(0) - RMS/GS) + c(RMS/LOC(0) - RMS/LOC) + d(\%PWR/GS(0) - \%PWR/GS) + e(\%PWR/LOC(0) - \%PWR/LOC)$


A multiple regression analysis was then performed on equation 13 using 
values for each of these measures recorded during the experiments. 

The data from seven pilots were used for model development, while those
from three other subjects were used for model verification. One pilot's
performance data were discarded due to equipment malfunction.

The results of the first attempt at regression indicated that the
coefficient of the %PWR/LOC term could not be differentiated from zero
based on a Student's t-test. This variable was eliminated from equation 13
and the analysis was repeated. This regression yielded non-zero values for
the coefficients a through d, and included a constant term. The resulting
equation was:

(15)   $\exp\left((TD/Skill)^2\right) = 1.4483 + 0.0351(\#TP(0) - \#TP) + 0.1765(RMS/GS(0) - RMS/GS) - 0.0366(RMS/LOC(0) - RMS/LOC) + 0.0377(\%PWR/GS(0) - \%PWR/GS)$

This analysis had an R-squared value of 75.6 percent and an F-ratio of
12.28 (p < 0.01). The coefficients determined for 15 may now be used in
equation 3, which becomes

(16)   $P = 1.4483 + 0.0351(\#TP) + 0.1765(RMS/GS) - 0.0366(RMS/LOC) + 0.0377(\%PWR/GS)$

These coefficients provide the relative weightings for each of the
performance terms, but they need to be scaled in order to provide the proper
characteristics for the equation. If each of the terms were at its
maximum value, that is 100 percent, then the combined performance measure
should also equal 100 percent. However, using the coefficients of equation
16, the combined measure is only 22.72 percent. To bring it to 100
percent, each coefficient must be multiplied by 100/22.72 = 4.40. The
modified performance equation becomes:

(17)   $P = 6.3730 + 0.1545(\#TP) + 0.7769(RMS/GS) - 0.1611(RMS/LOC) + 0.1655(\%PWR/GS)$

A plot of this function versus the task difficulty, obtained from equation
11, is provided in Figure 5.
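The rescaling between equations 16 and 17 is a short piece of arithmetic that can be reproduced from the printed coefficients. The sketch below is illustrative only; the dictionary keys and function names are hypothetical, and small differences from the printed values of equation 17 are to be expected from rounding of the scale factor.

```python
# Coefficients as printed in equation 16.
COEFFS_16 = {
    "CONST": 1.4483,
    "#TP": 0.0351,
    "RMS/GS": 0.1765,
    "RMS/LOC": -0.0366,
    "%PWR/GS": 0.0377,
}


def value_at_full_scale(coeffs):
    """Combined measure when every performance term equals 100 percent."""
    weights = sum(v for k, v in coeffs.items() if k != "CONST")
    return coeffs["CONST"] + 100.0 * weights


def rescale(coeffs):
    """Scale all coefficients so the full-scale value becomes 100 percent."""
    factor = 100.0 / value_at_full_scale(coeffs)
    return factor, {k: v * factor for k, v in coeffs.items()}


factor, scaled = rescale(COEFFS_16)
print(round(value_at_full_scale(COEFFS_16), 2), round(factor, 2))
for name, value in scaled.items():
    print(name, round(value, 4))
```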

It was hoped that these curves would resemble those given in the
hypothetical plot in Figure 1, and for some of the pilots a general overall
downward trend is present. Even though the curves do not match the
hypothetical ones exactly, there are some common features between them.
First of all, the curve for the lowest skilled pilot, 7, is seen to decrease
much more rapidly than the curves for the more highly skilled pilots (3,
11; the two points for 3 are for the third and highest levels of mental
loading respectively).

To test this model's value as a predictive tool, the data from three
subjects not included in the model determination were substituted into
equation 17 and plotted versus perceived task difficulty in Figure 6.

Pilots 12, 8, and 16 produce some interesting, if not consistent,
results. The three points of pilot 12 and pilot 16 are for the second,
third, and highest loading levels. All three pilots show a net decrease in
performance between their lowest and highest task difficulties, even though
they accomplished this decrease in very different ways. Pilot 8 appears
to be the closest to the theoretical model, with his sharp decrease in
performance over a very small task difficulty increase. Pilot 16, on the
other hand, appears to be decreasing at an exponentially decreasing rate, as
opposed to the model, which predicts decreasing performance at an
exponentially increasing rate. Pilot 12 increases performance sharply
between his second and third runs and then decreases just as sharply
between the third and fourth runs.

Since the choice of the exponential model for
performance/skill/workload was arbitrary, two other forms for the model
were also examined. These were circular and linear models; neither was
as good at fitting the data as the exponential, and hence both were
abandoned.




Figure 5. Combined performance (from model) vs. perceived task
difficulty for 7 pilots used in model development



Figure 6. Combined performance vs. task difficulty for 3 test subjects

The models described here are still under development, and work is in
progress to repeat the experiments described here and to apply this
methodology to other instrument flight scenarios.

Summary 

This paper presents some of the findings from a set of experiments
designed to explore the relationship between performance, skill, and visual
scanning behavior of aircraft pilots under varying levels of mental
workload. Instrument fixations were recorded as a group of pilots with
widely varying levels of skill simultaneously performed a constant
instrument flight task and a verbally presented loading task with 4
discrete levels. Initial results indicate a tendency of lesser skilled
pilots to stare at the primary instrument as loading is increased and to
alter the frequency of usage of different scan paths. Skilled pilots
demonstrated much less change on both of these measures.

A major finding of the research suggests that under relatively
constant instrument flight conditions the entropy rate of the visual scan
path may be a useful measure of the level of mental workload induced by a
constant rate verbal task. This measure of workload was combined with
independent estimates of performance on the piloting and verbal tasks and
of pilot skill. An exponential model relating these factors was developed
and has undergone preliminary tests. The model helps provide insight on
the intimate connections between a particular workload measure and operator
skill and performance strategy.